perm filename APP2[AIM,DBL] blob
sn#126870 filedate 1974-10-29 generic text, type T, neo UTF8
00100 .DEVICE XGP
00200 .FONT 1 "FIX25"
00300 .FONT 2 "SIGN57"
00400 .FONT 3 "SHD40"
00500 .FONT 4 "BDI25"
00600 .FONT 5 "NGB30"
00700 .FONT 6 "NGR20"
00800 .TURN ON "↓_π{"
00900 .TURN ON "⊗" FOR "%"
01000 .MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
01100 .MACRO E ⊂ APART END ⊃
01200 .TABBREAK
01300 .COMPACT
01400 .EVERY FOOTING(⊗6Fourth Draft .... {DATE},page A2.{IF PAGE = 1 THEN 1 ELSE PAGE},BEINGs' knowledge⊗*)
01500 .EVERY HEADING(⊗3BEINGS⊗*,,⊗4Doug Lenat⊗*)
01600 .COUNT PAGE PRINTING "1"
01700 .NEXT PAGE
00100 .SELECT 1
00200
00300 ⊗2APPENDIX 2. ⊗* ⊗3THE BEINGS⊗*
00400
00500 Here we present summaries of the knowledge embedded in each BEING.
00600 Only the most important parts of each BEING are even mentioned.
00700 First we present those BEINGS used to write CF. Next come the ones we
00800 had to add to the pool to get PUP6 to write GI and INF. Among the
00900 first group, we further subdivide it into (a) high-level,
01000 domain-specific, (b) low-level domain-specific, (c) ubiquitous,
01100 domain-independent, (d) high-level, domain-independent, (e)
01200 low-level, domain-independent, (f) non-BEING knowledge in variables,
01300 and (g) a few interesting demons and functions. The number of
01400 additional BEINGs to write GI and INF
01500 is so small we don't subdivide their descriptions.
01600
01700 .SELECT 5
01800
01900 (i) The knowledge necessary to write a concept formation program:
02000
02100 A. High-level, domain-specific knowledge
02200
02300 .SELECT 1
02400
02500 ⊗4CONCEPT-FORMATION⊗*.
02600 Each interesting BEING part is translated into English. Just this once,
02700 the name of the part will be (parenthesized) preceding the translation.
02800 (pre-requisite:)
02900 The user must be aware that we are about to
03000 undertake concept formation.
03100 (demons:) Inference and attention-focussing demons
03200 must be turned on.
03300 (effects:) After completion of this task, PUP6 will be able
03400 to learn concepts.
03500 (generalizations:) This is a specialized form of attending, learning,
03600 and doing inductive inference.
03700 (alternatives:) It is an alternative to grammatical
03800 inference, pattern recognition, and simulated evolution.
03900 (specializations:) Its
04000 structure must be one of the following: classificatory concept
04100 formation, comparative concept formation, or metrical concept
04200 formation. We must make the boolean decision as to whether or not
04300 concepts may vary with time. Similarly, whether the speed of
04400 presentation of the stimuli is relevant; if so, then we must
04500 constrain the effort spent on various phases of identification.
04600 Instances may be left in view indefinitely, or may be removed right
04700 after processing; this latter case holds for CF; it means we must
04800 derive all relations (features) as soon as we see a scene. The
04900 program will have to be just complex enough to handle conjunctive,
05000 disjunctive, or both kinds of concepts; this is another decision to
05100 make. Similarly for positive, negative, both, or neither kinds of
05200 transfer (psychological), which affects the recognition that a
05300 concept is new, and how previously learned concepts interact with the
05400 learning of new ones. We must decide whether to use positive,
05500 negative, or both kinds of instances of a concept. Subject-specific
05600 behavior may be required; that is, the program may have to input a
05700 vector describing a particular individual, and its whole structure
05800 must mimic this subject. The last decision is one of adapting the
05900 program to an extended sample dialogue which the user must furnish;
06000 this will help both to check out the program writtten, and to fix
06100 various print statements and I/O formats.
06200 (complexity-vector:) It is easy to call this
06300 BEING (.1), it has a 50-50 chance of calling* itself, it has only a
06400 0.5 chance of succeeding, but the effort to try it is moderate (.5).
06500 There is no fundamental reason for delaying its investigation (.1).
06600 (iden:) It recognizes
06700 itself only through exact matching of one of seven
06800 patterns. (what, how, why:)
06900 It has sentences describing what it does, how, and why. (when:) It
07000 is unlikely (-70) to be called if some type of concept formation is
07100 already doable by PUP6; if PUP6 wants to characterize classes, then
07200 it's very likely (88) to be called. The presence of new information
07300 delays (-60) our calling the BEING, since it might affect what we
07400 should do. Conversely, the absence of new information mildly (40)
07500 encourages us to go on and try it.
07600 (post:requisites:) When finished, the user must be
07700 aware that PUP6 has decided on a particular type of concept
07800 formation, and that it has done it. (affects:)
07900 The other BEINGs affected depend
08000 on the decisions mentioned earlier.
08100
08200 This is an over-abundance of information. From now on, I will
08300 only give the little pieces of information which are crucial to the
08400 BEING; its essence. Also, the BEING part names won't be explicitly
08500 provided.
08600
08700 ⊗4CLASSIFICATORY CONCEPT FORMATION⊗*. To do this, we must partition a
08800 domain, in an interactive "guessing" manner.
08900
09000 ⊗4COMPARITIVE CONCEPT FORMATION⊗*. Same as above, but then we must
09100 partially order the equivalence classes of the partition. It is
09200 harder, also.
09300
09400 ⊗4METRICAL CONCEPT FORMATION⊗*. Same as previous BEING, but we must
09500 also induce a metric on the partial ordering of the classes. This is
09600 even more complicated.
09700
09800 Since we actually do classificatory CF, the BEINGs to order a
09900 partition and to metrize an ordering weren't implemented.
10000
10100 ⊗4PARTITION A DOMAIN⊗* in a guessing, interactive manner. The
10200 partition may be only partial, it may be only weak, and, most
10300 crucially, the BEING must be able to do some of these: partition by
10400 accepting an element and getting its class name (guessing the name
10500 and then checking it somehow), partition by accepting a class name
10600 and getting its member elements, partition by accepting ordered pairs
10700 <element, classname>. The fringe of consciousness demon must be
10800 activated from now on.
10900
11000 ⊗4PARTITION BY TAKE ELEMENT GET CLASS⊗*. Take hold of an element (by
11100 reading, for example), and then work to get the name of the class it
11200 belongs to (by guessing, then verifying the guess, for example). Then
11300 modify the structure of the class(es) involved.
11400
11500 ⊗4PARTITION BY TAKE CLASS GET ELEMENT⊗*. Take hold of a class name,
11600 and then work to get elements it contains. Then modify the structure
11700 of the class and the element(s) involved.
11800
11900 ⊗4PARTITION BY TAKE ELEMENT AND CLASS⊗* simultaneously. Take hold of
12000 both an element and its corresponding class name, and use these to
12100 modify the structure of the partition; i.e., modify the class
12200 mentioned if the partition is stored by classes.
12300
12400
12500 ⊗5B. Low-level, domain-specific knowledge⊗*
12600
12700 ⊗4RECOGNIZE CONTRADICTION⊗*. It can translate "... is incompatible
12800 with ...". It is a predicate, fairly easy to write therefore. It is
12900 composed of SOMEOF the following: Probability≡0 contradiction,
13000 Probability≡1 contradiction, Probability >0 and <1 contradiction.
13100 Since these names are fairly cryptic, some of their parts (e.g, HOW)
13200 must be printed out to help the user choose (whenever they are
13300 involved, if the user asks for ramifications.)
13400
13500 ⊗4PROBABILITY ≡ 0 CONTRADICITON⊗*. Since this is a very simple thing
13600 in our domain of concept formation, it is immediately encodable as
13700 (MEMBER arg1 arg2). That is, if arg1 has probability zero of
13800 occurring in arg2, yet it does, then we have a contradiction.
13900
14000 ⊗4PROBABILITY ≡ 1 CONTRADICTION⊗*. Immediately encodable as (NOT
14100 (MEMBER arg1 arg2)). If arg1 has probability one of occurring in
14200 arg2, yet it doesn't, then we have a contradiction.
14300
14400 ⊗4PROBABILITY >0 AND <1 CONTRADICTION⊗*. Immediately enacodable as
14500 NIL. If arg1 might and might not occur in arg2, we can't get a
14600 contradiction just by checking its membership. Of course, the idea
14700 that this is the only way to prove contradiction is what makes these
14800 BEINGs domain-specific!
14900
15000 ⊗4SCENE⊗*. This is a data structure, composed of four subparts. The
15100 first is a set O of objects. Next is an atom indicating the name N of
15200 the scene. Next come two lists of features, where a feature is just a
15300 predicate and its arguments. The first is the static relations S
15400 between objects. Finally we have the dynamic relations D between
15500 objects.
15600
15700 ⊗5C. Ubiquitous, problem-independent BEINGs and functions⊗*
15800
15900 ⊗4CHOOSE FROM⊗*. All its arguments must be BEINGs, else it prints a
16000 nasty warning message. We select the best BEING and apply it. If it
16100 fails, we re-order the remaining BEINGs and apply the best one of
16200 them. Note that this new reordering may use knowledge gleaned during
16300 the earlier, unsuccessful BEING try. Thus, this is a (possible)
16400 intelligent nondeterministic branch point. The intelligence lies (or
16500 fails to lie) in the comparison function, BETTER, which decides who
16600 goes next.
16700
16800 ⊗4SATISFY⊗*. This is the equivalent of a pattern-directed goal
16900 statement. We ask each BEING, "Can you do anything matching x?". We
17000 take the list of those answering affirmatively, and CHOOSE FROM it
17100 one BEING after another until the desired effects are realized.
17200 Notice that a BEING who said "probably" may succeed in his
17300 application and yet not effect the result we wanted, so that a
17400 trivial call on CHOOSE FROM is insufficient. The BEINGs possibly
17500 affected are just those answering affirmatively.
17600
17700 ⊗4MESSAGE⊗*. This BEING has a main effect (AWARE USER x), hence is
17800 very frequently called. The forgetful-user demon trims the aware
17900 user list periodically. Message looks to see if its message is on
18000 that list; if not, it inserts it and prints it out to the user. If it
18100 is, it moves the message to the front of the aware-user list and
18200 prints out nothing. This is an example of a specialized, fixed
18300 assertion-list, as described earlier.
18400
18500 ⊗4DETERMINE ARG VALUES⊗*. These functions take as input the name of a
18600 function, and output a description of what arguments it will ever be
18700 called with (in the existing code.) For example, it might say "arg1
18800 will always be NAME:OF:CLASS, and arg2 will consecutively be all
18900 integers from 3 to (LENGTH SET:OF:CLASSES)". At present these work in
19000 the obvious way, looking at everything. The tremendous amount of CPU
19100 time spent in these functions indicates that I should have made
19200 special assert:lists for argument instantiations, and updated them
19300 each time a BEING is called in the target code.
19400
19500 ⊗4FLOW-PRECEDED⊗*. This BEING must search through he code to find a
19600 form matching a given pattern. Although it is used under ten times
19700 in the total dialog, it is so costly that I've implemented it as an
19800 ask-the-user call. Work must be done here to understand why this is
19900 so inefficient, and to remedy it.
20000
20100 ⊗4FIND AND TAG⊗*. This BEING is similar to flow-preceded, except that
20200 the pattern-matching is between two constant strings. This is
20300 tolerably efficient in CPU time and is used heavily throughout the
20400 writing of CF.
20500
20600 ⊗4TRANSLATE⊗*. The natural language front end is managed by this
20700 BEING. It asks each BEING whether it recognizes a given string.
20800 Translate then takes the "best" -- the most probable -- of these as
20900 the translation, and can backtrack and reorder the remaining
21000 interpretations if it has to. If called with no argument, it examines
21100 various assert-lists to see if it can do any good. The idiom demon
21200 must be activated during the control period of this BEING.
21300
21400 ⊗4REINVESTIGATE DECISION⊗*. This is usaully called by a demon who
21500 watches the deferred-decision assert list. We transfer the decision
21600 in question from the deferred to the undeferred decision assert list.
21700 A deferral demon will promptly react to anything on this latter list.
21800 An interesting caution: it was necessary to inhibit all demons during
21900 the execution of this BEING, for reasons very similar to those
22000 leading to lock and unlock system commands. The fact that some BEING
22100 might have to be demon-uninterruptable forced us to institute an
22200 entire new question asking just about this tiny point!
22300
22400 ⊗4DEFER DECISION⊗*. Remove the decision from the undeferred decision
22500 assert list. Determine the situation when we must next reinvestigate
22600 this decision. This will be some predicate examing the state of the
22700 world. If this predicate is true currently, we must resolve the
22800 decision now. Otherwise, we put the decision on the deferred decision
22900 list, attached to its (new) reinvestigation predicate. Demons must be
23000 inhibited during this BEING's reign, to ensure that its notions of
23100 the world are accurate upon exit.
23200
23300 ⊗4WHEN NEXT⊗*. Manipulate the decision to extract the name of the
23400 variable holding information relevant to deferring the decision.
23500 Utilize this knowledge so as to keep the effects of the decision
23600 irrelevant; i.e., find the (next) situation in which they are not
23700 irrelevant. Whoever called this BEING is now asserted to be aware of
23800 its results. This is fairly complex, and the BEING is not called
23900 unless it is necessary. As it happens, it is called a few times for
24000 every decision to be made about every BEING in the target program
24100 (several hundred times).
24200
24300 ⊗4UTILIZE⊗*. This BEING applies various knowledge "variables,"
24400 starting at specific ones and moving toward very general ones, until
24500 one of them reports it is able to acheive the desired goal.
24600
24700 ⊗4RESOLVE DECISION⊗*. Again, all demons must be inhibited. After some
24800 preliminary searching and very trivial theorem-proving fail, this
24900 BEING resorts to asking the user about how to resolve a given
25000 decision.
25100
25200 ⊗4ASK USER ABOUT⊗*. We determine the argument instantiations of the
25300 little piece of code we're worrying about, determine the type of
25400 decision to be made, and apply the specific knowledge variable for
25500 x-ing that type of decision. Here, we get x by examing who called
25600 this BEING and why. To write a specialized version of ask-user-about,
25700 we just write a standard print, read, and assign function, with the
25800 details left unspecified until the sample session is read in.
25900
26000 ⊗4BETTER⊗*. This function is used to compare two BEINGs, and decide
26100 which of them should gain control. It evaluates their WHEN parts,
26200 and if they tie it evaluates their complexity vectors. Note that
26300 "eval" here is not trivial: each dimension of the complexity vector
26400 of a BEING can be a little program which examines itself, other
26500 BEINGs, and the state of the world before deciding on a numerical
26600 answer to return.
26700
26800 ⊗5Handling of User Interrupts⊗*. There are several functions and
26900 BEINGs involved in this process. Initially, the user describes how
27000 often the system is to give him the opportunity to interrupt and
27100 query it. At each of these times, the HANDLE USER INTERRUPT function
27200 asks the user if he wants to interrupt; if so, PROCESS USER INTERRUPT
27300 is called to do the job. In addition to asking for pieces of any
27400 BEING, the user may request limited simulated execution of various
27500 pieces, and may order the current BEING to FAIL.
27600
27700
27800
27900 ⊗5D. High-level, problem-independent knowledge: how to write
28000 programs⊗*
28100
28200 ⊗4SERVE⊗*. Obtain information until some of it is "executable," then
28300 carry it out. The forgetful-user demon is activated. Without this
28400 top-level purpose, PUP6 sits contentedly, never wanting to accept a
28500 new task.
28600
28700 ⊗4WRITE PROGRAM⊗*. The user must be made aware that PUP6 is about to
28800 write a program, what kind of program it is, what its name is (this
28900 will force a get-name BEING call), and that its type has been
29000 examined (this will cause a study-type BEING call). Upon exit, he
29100 must be told that PUP6 has completed the task, and what its new
29200 capabilities are. To wite a program, one enters a loop, broken only
29300 when several completion conditions are all true simultaneously: the
29400 top-level task is now a BEING, there are no undefined sections of
29500 code, there are no warnings left about the code, there is no
29600 executable information anywhere, there is no new but unprocessed
29700 information, there are no decisions still pending (except those
29800 requiring "everything else" to be complete; e.g., the adaptation of
29900 output formats using a sample session). If we do break out of the
30000 loop, we must update the list of programs written,
30100 the list of what
30200 PUP6 can now do, the list of what the user may do, and then we must
30300 gather up the relevant new BEINGs and
30350 dump them onto a file. This latter
30400 task is done by SUPPORT&DUMP.
30500 In general,
30600 of course, we won't break out, so we activate all the current demons
30700 and go on. All the body of the loop is is one CHOOSE FROM, between
30800 six alternatives: obtaining some usable info, using some usable info,
30900 filling in some function call which is currently undefined,
31000 clarifying some little piece of code known to be improbable for some
31100 specific reason, adapting some known function to conform to some
31200 specific new requirements, fixing some piece of code which doesn't
31300 work the way it claims to work. The last two of these are simply
31400 program modification and debugging, respectively! Failure of one of
31500 these six BEINGs simply causes CHOOSEFROM to try another one; failure
31600 of a demon causes the whole WRITE PROGRAM BEING to fail. During its
31700 reign, the program-writing demons, deferral-demon, and
31800 reinvestigation-demon are all activated. Its complexity vector is
31900 dependent upon that of the BEING closest to the task it must perform.
32000
32100 ⊗4SUPPORT&DUMP⊗*. We find the set of support
32200 of the top-level task and create a new file with the relevant
32300 functions and BEINGs (which automatically does global initializations
32400 and then enters the top-level task instead of SERVE). The user is
32500 informed of all this, and allowed to interrupt and ask questions.
32600
32700 ⊗4OBTAIN USABLE INFORMATION⊗*. The WHEN part informs us that this is
32800 always undesirable (-10) but is OK (111) if there exists new but not
32900 yet usable information. All we do here is CHOOSE FROM the following
33000 four alternatives: translate something, get brand new information,
33100 analyze the implications of existing information, extract a small
33200 relevant subset of the existing information to concentrate on.
33300
33400 ⊗4USE INFORMATION⊗*. This demands that some executable information
33500 exist. We select one such piece, and try to execute it. If we fail,
33600 its worth is decreased; if we succeed, it is removed from the
33700 executable info assert list.
33800
33900 ⊗4FILL IN UNDEFINED SECTION⊗*. There must be some undefined section.
34000 If so (80) we don't want any hi-priority (≤20) coding warnings still
34100 around (-150), and we do like there to be something both undefined
34200 and known to be encodable (96). We fix a choice of what to encode,
34300 and try to acheive its encoding. If we fail, we update the difficulty
34400 of that choice, and may assert that we want some specific new
34500 information to relieve the problem. In addition to ENCODE, this
34600 BEING also may call MAKE ENCODABLE and STUDY TYPE.
34700
34800 ⊗4CLARIFY IMPROBABLE SITUATION⊗*. This BEING demands that something
34900 of mediocre priority (≤500) exist on the coding warning assert-list.
35000 It likes (51) this, and dislikes (-41) anything on the undefined
35100 section list, or anything (-200) on the encodable section list. As
35200 always, a sentence is provided to justify each of these little
35300 beliefs. We choose the warning with the highest priority (lowest
35400 numerical weight) on the coding warning list, note that that is what
35500 PUP6 is working on now, and do a match to decide what type of warning
35600 it is. (i) Replace x by y in z. Here these may be nonspecific; z may
35700 be "in all code recently generated". The nature of y may cause us to
35800 include new warnings; y may mention a new data structure. (ii) x in y
35900 is undefined; probably z since r. This may cause us to add to the
36000 global initialization list. It will probably cause us to ask the user
36100 what the answer is. (iii) x is a data structure but we don't know
36200 much about it. We try to find out its structure, how to initialize,
36300 access, insert, delete, update it. A variant of this warning is: (iv)
36400 We find no x's associated with data structure y. Here x can point to
36500 insertions, deletions, initializations, etc. If we can't justify the
36600 lack, we try to defer the decision. Failing that, we ask the user.
36700 (v) Command: if x then y. This is a programmed demon; when situation
36800 x is true, we must do y. (vi) Delete all mention of x. This is like a
36900 replace, but we go through the assert lists with an eye toward
37000 deleting unnecessary worries. (vii) Infinite loop in x from y to z.
37100 If we can't justify this, we insert a test to break out of the loop.
37200 Justification might be that this loop is in the top-level function of
37300 the system, where we never wnat to break out. (viii)
37400 Incomprehensible: x, y. (there is a "bug" in x manifesting itself as
37500 y) Never needed to write CF, so not implemented. Should call FIX
37600 INCORRECT PIECE (which is also not in yet) or ask the user for
37700 assistance.
37800
37900 ⊗4GET NEW INFORMATION⊗*. Naturally, it is not thrilled if (-68) there
38000 exists some new but unexamined information, and it is happy (50) if
38100 there is none. The prerequisites ensure that the user is aware of
38200 what PUP6 wants, and if the theorem prover can't deliver it, PUP6
38300 asks the user for some. If PUP6 asks for something general ("any
38400 task") it is because it knows precisely that this is what it wants,
38500 not out of ignorance! During execution, the specificity check demon
38600 is active; he ensures that it is indeed phrased as specifically as
38700 possible; if not, MAKE SPECIFIC will be called. This is a very
38800 uncomplicated BEING, and a very unpopular one to use since we should
38900 squeeze every drop of meaning out of what we have before asking for
39000 more information.
39100
39200 ⊗4EXTRACT RELEVANT SUBSET⊗*. This likes (70) there to be a great deal
39300 (≥50 pieces) of new information, and dislikes (-80) it if there are
39400 under a dozen such tokens. It finds and evaluates knowledge variables
39500 to constrain what should be looked at currently. This is never called
39600 in the dialog, though it was in the protocol.
39700
39800 ⊗4ANALYZE IMPLICATIONS⊗*. The WHEN part is unhappy (-60) if there is
39900 usable information already, since this BEING is fairly costly. It
40000 also examines the size of the new info list to see just how long this
40100 search will be. The BEING locates the code that will be affected,
40200 predicts the affects, and sees how desirable this is. This BEING is
40300 also never needed to write CF.
40400
40500 ⊗4MAKE ENCODABLE⊗*. If all else fails, this BEING tries replacing a
40600 function by one of its alternatives or, as a last resort, by one of
40700 its generalizations. This last resort is never needed to write CF.
40800
40900 ⊗4ENCODE⊗*. Despite its key name, this BEING is mostly a bookkeeper!
41000 It runs around the assert lists, gathering up enough information to
41100 encode a specified little piece of code. A program schema is built
41200 up, instantiated as much as possible, printed out for the user to
41300 refer to, and passed to a highly optimized recursive auxilliary
41400 function, GETCODE. Some worrying about the arguments is
41500 done,including what they might be instantiated as. We inform the user
41600 of the code BEING written, and a prerequisite causes the function to
41700 become made into a BEING.
41800
41900 ⊗4STUDY TYPE⊗*. This BEING accepts a BEING call, looks at that
42000 BEING's SPECIALIZATIONS part, translates each entry into a decision,
42100 and dumps these onto the undeferred decision list. A deferral demon
42200 promptly attends to them.
42300
42400 ⊗4GET DATA STRUCTURE⊗*. We call select-structure type, and use its
42500 advice to point to the proper knowledge variable. We eval it to set
42600 up the various parts of the data structure. In unclear cases, we may
42700 ask the user whether the argument really is a data structure. We
42800 ensure that this object is a BEING (else we make it one,) and we add
42900 warnings to the effect that there might not be any insertions or
43000 deletions; we'll worry about that much later, by another BEING. We
43100 know that the elements of a data structure are themselves usually
43200 data structures, so one the prerequisites says that we must try to
43300 make the arguments into data structures as well.
43400
43500 ⊗4GETCODE⊗*. This is a long, special-case, recursive function. It
43600 goes through a piece of code with two jobs: to build up a list of
43700 arguments to this function, and to get names for specific instances
43800 of general BEINGs. Part of the kludgy character is due to the fact
43900 that there are non-BEINGs in the code: primitive forms (structure,
44000 tuple, vector, class, comment, atoms), primitive functions (read,
44100 null, and, ...), primitive variables (T, NIL, 4, ...). By keeping
44200 closely to the theoretical ideals implicit in the ideas, this
44300 function would probably become trivial.
44400
44500 ⊗4GET NAME⊗*. First we study plausible names for the new specialized
44600 BEING. We make the user aware of them. We examine the BEING name
44700 carefully, to see if it was just mentioned in the same piece of code
44800 (probably then the same name), whether it's ever been mentioned and
44900 specialized, and so on. Sometimes we end up deciding not to get a new
45000 name, sometimes we pick one and just tell the user, sometimes we
45100 recommend some old specialized name. In 90% of the cases, though, we
45200 simply say "give me a name for ...". A new-name counter is maintained
45300 to ensure unique names, and this number is appended onto the end of
45400 what the user types, unless it's an already-existing BEING name. The
45500 user my type ] (and usually does), indicating that PUP6 is to choose.
45600 PUP6 always informs the user what the name is at the end.
45700
45800 ⊗4MAKE NEW BEING⊗*. This BEING has the awesome responsibility of
45900 giving life to new BEINGs. As is the case with neurons, soldiers, and
46000 all (good) BEINGs, no one BEING really does much when you look at it
46100 in isolation. Thus this one already gets the name of the BEING, the
46200 name of its parent (its least general generalization), how the
46300 metacodes of these two differ, the argument list, the reason for this
46400 specialized BEING to exist, what extra qualities it possesses or
46500 lacks wrt its parent, etc. It can figure out who it affects by
46600 examing its various parts, and it bases the complexity vector on that
46700 of the parent and on the changes in this new BEING. Thus it basically
46800 does gets, evals, and puts. It updates various assert lists, and
46900 semi- compiles the new BEING. One bit of knowledge says that the
47000 explicit args check may be significantly different, and the user may
47100 be queried.
47200
47300 ⊗4SEMI COMPILE⊗*. Basically, BEINGs only lend themselves to
47400 interpreting; to help speed up this process, this BEING can take
47500 advantage of regularities and simplicities in another BEING's parts,
47600 and turn out a fast function to do an equivalent job. For example,
47700 if a BEING doesn't enable any new demons, we needn't push the demon
47800 stack at the beginning and pop it at the end.
47900
48000
48100 ⊗5E. Low-level problem-independent knowledge: bits of
48200 programs⊗*
48300
48400 ⊗4PROPOSE PLAUSIBLE NAMES⊗*. This BEING is called by GetName, and
48500 should incorporate a good model of user psychology of name-choosing.
48600 Currently, it applies Initials, Mainwords, Firstfew, and compositions
48700 of these, to the task description sentence.
48800
48900 ⊗4SELECT STRUCTURE TYPE⊗*. This is a true bit of programming
49000 knowledge: if the structure is composed of several weakly-interacting
49100 pieces tied together through association with one atom, then the data
49200 structure should be a list of these atoms, with the associated
49300 structures BEING stored on the property lists of the atoms. If there
49400 are only a couple pieces, or they interact strongly, we should use a
49500 nested list structure instead.
49600
49700 ⊗4ELEMENT⊗*. All we know about an element is that it is a member of a
49800 data structure, and that we should not be ashamed to ask the user to
49900 clarify himself if he mentions this. The ConceptFormation BEING --
50000 not the ELEMENT BEING -- should note that future references to
50100 ELEMENT actually refer to a scene, an instance of a concept, and that
50200 references to class refer to the model of a concept, a set of scenes.
50300 This may change as new data structures come into existence.
50400
50500 ⊗4MODIFY STRUCTURE⊗*. Generally, we will be given a typical element,
50600 and must identify the structures it belongs to, and modify them. The
50700 precise details indicate that some subset of CONDITIONAL INSERTION,
50800 CONDITIONAL DELETION, and COMPLEX ALTERATION will be involved.
50900
51000 ⊗4GET HOLD OF⊗* by guessing and checking. We must discover whether an
51100 algorithm already exists which can get arg1. If not, we try to find
51200 one which can get arg1 and some other effects. We structure the
51300 function as some of COMPUTE, GENERATE&TEST, and SEARCH. Finally, we
51400 must decide now on the type of error recovery desired: none,
51500 backtrack, or correct-and-try-to-proceed.
51600
51700 ⊗4TAKE HOLD OF⊗* trivially. We examine the flow of control through
51800 the preceding code, to decide whether arg1 has ever been SET. If so,
51900 we must ask the user whether or not to read in a new value. If no
52000 read is to be done, then this BEING reduces to a simple assignment,
52100 or perhaps to nothing at all. Otherwise, we read in the object, and
52200 assign its various subparts to SOME PART OF it.
52300
52400 ⊗4IS OF TYPE⊗*. This BEING is a predicate which is too low-level and
52500 general to do much. Basically, it helps formulate a question to the
52600 user, who must explain to PUP6 how to construct any predicate it
52700 comes across, usually just by translating a sentence the user types
52800 in.
52900
53000 ⊗4COMPLEX ALTERATION⊗*. This BEING is always replaced by some
53100 initializing assignments, followed by calls on MODIFY STRUCTURE for
53200 some subparts of arg1. A bit of advice is that if arg1 is
53300 unstructured, try to get it structured. If this fails, maybe what is
53400 really wanted is to modify the structure of which arg1 is a member.
53500
53600 ⊗4CONDITIONAL DELETION⊗*. As above, we look at the structuring of
53700 various arguments to decide what is really supposed to be deleted
53800 from where. We check with the user, remind him of various bindings
53900 relevant to the current call, and ask him to describe (1) under what
54000 conditions we do the deletion, (2) what exactly do we delete. These
54100 are translated, analyzed in the context of deletion, and help
54200 determine the deletion piece of code.
54300
54400 ⊗4CONDITIONAL INSERTION⊗*. This is similar to the preceding BEING.
54500 Here we also worry about whether the entity to be inserted has ever
54600 been bound. If not, we must see that it is! Often, this binding will
54700 be related to the Initialize piece of the DataStructure part of the
54800 BEING representing the structure we're inserting into. Since some
54900 data structures have several similar but distinctly-named elements
55000 existing simultaneously, we have lots of little worries. After we
55100 have planned out the code, we check with the coding warning list and
55200 add to it; e.g., any undefined initial value of a variable in a piece
55300 of code we stuck in here, will also be uninitialized here. If we
55400 later decide never to worry about the first initialization, we must
55500 not forget this one! This is a frequent source of bugs, I think.
55600
55700 ⊗4GENERATE&TEST⊗* and ⊗4COMPUTE⊗* are not needed or implemented.
55800
55900 ⊗4SEARCH⊗*. Currently, a primitive type of searching suffices. We
56000 simply execute the iteration FOREACH X IN XSET DO (TEST X). If we're
56100 unsure, we check that XSET has the form SET OF (PLURAL X). The
56200 specializations tell us to worry about termination conditions. I
56300 suppose this BEING could also be used for generate-and-test.
56400
56500 ⊗4FOREACH⊗*. Not surprisingly, this is the iteration BEING. It is an
56600 NLAMBDA function, so its arguments aren't surrounded by quotes. There
56700 are various minor decisions to make, which simplify any specialized
56800 version: there may or may not be an "until" condition; we must get it
56900 and also decide what to bind the iteration variable to and what value
57000 to return, both in case this condition is met, and in case we exhaust
57100 the space. Often, we will decide to leave some of these unspecified,
57200 put notes about them on the coding warning list, and not worry about
57300 them for a long time.
57400
57500 ⊗4TEST⊗*. This BEING is a predicate which is oriented toward a
57600 threshold of acceptability, whereas IS_OF_TYPE is oriented toward
57700 separating cases. It either has the flavor of comparing, or of
57800 competition. We must also decide whether the result is nominal,
57900 ordinal, or of ratio caliber. These latter two never crop up, which
58000 is why we assume the test is a predicate always.
58100
58200 ⊗4SOME PART OF⊗*. We either get this simple destructive function by
58300 examples (like Shaw's program) or by translating a user-supplied
58400 English sentence describing what it does. Typically, some
58500 combinations of CAR and CDR, occasionally uses some constants (1, T,
58600 NIL) or constructive primitives (CONS, NCONC).
58700
58800 ⊗4COMPARE⊗*. We must worry about whether the result is nominal,
58900 ordinal, or ratio. We also decide whether the comparison is a JOINING
59000 FUNCTION applied to its arguments, a constant function (executed only
59100 for side effects), or a JOINING FUNCTION applied to the results of
59200 COMPAREing corresponding subparts of the arguments. This last case is
59300 both frequent and complicated. If neither argument is structured,
59400 this is impossible. If both are highly structured, their structures
59500 must have a nontrivial amount of correspondence in order to succeed.
59600 If only one argument is structured, this strongly suggest that the
59700 other one should be similarly structured. Often we go ahead and
59800 structure it without asking the user (inference by analogy.)
59900
60000 ⊗4BETTER⊗*. We've discussed this earlier. Here we should note that
60100 when called on to write a new, specialized BETTER function, it
60200 chooses either a simple function (T, NIL) to allow an optimization
60300 (replace by CONS, replace by NCONC1), or it chooses a complex
60400 comparing function (e.g., alphorder, write-program itself!).
60500
60600 ⊗4JOINING FUNCTION⊗*. This is a way of condensing results. It is
60700 typically a known function, such as AND, OR, PLUS, MAXIMUM, PROGN...
60800 which is determined by (i) the character of the result (numerical,
60900 T/F) and (ii) whether we are in a DO UNTIL loop, a DO REPEATEDLY
61000 loop, or neither of these loops.
61100
61200
61300 ⊗5F. Programming Knowledge stored in variables⊗*
61400
61500 To resolve an ADAPTATION decision: Ask for sample dialog,
61600 symbolically run current code, modifying I/O formats.
61700
61800 To defer any decision whose AFFECTS part is known: It may
61900 translate as some detail of x; in that case, wait until some code for
62000 x already exists. The affects part may translate as The x algorithm;
62100 in this case we worry as soon as PUP6 begins DO-ing any coding at all
62200 of x.
62300
62400 To defer an ALTERNATIVES decision: Examine the coding tasks
62500 in all cases, and try to find some common head and/or some common
62600 tail. If there is any, try to defer the decision until after writing
62700 code for this common subtask sequence. If one alternative exactly
62800 matches the common sequence, we can temporarily assert that the
62900 decision has been made to do this simplest BEING.
63000
63100 To terminate an AND loop: The conjunction is usually between
63200 highly similar objects. Related to how to parse a sentence containing
63300 ANDs.
63400
63500 To resolve a BOOLEAN decision: Ask the user to respond Yes or
63600 No. The decision has special Yes, No, and Both parts; in each case,
63700 two of them will be evalled.
63800
63900 To resolve a DEFINITION decision: Locate the defined object,
64000 reaffirm that it is undefined, ask the user to define it, check
64100 whether it is a predicate, data structure, etc., and tell the user
64200 about such constraints.
64300
64400 To resolve a DICHOTOMY decision: Each such decision contains
64500 a little program which now is evaluated. If it succeeds, its value
64600 answers the decision; if it fails, we have to ask the user to choose
64700 alternative 1 or 2. The choice points to a little program to run, and
64800 its value is the desired resolution.
64900
65000 To resolve a PREDICATE decision: Fix the context of the
65100 predicate call, try to relate this predicate to some known tests,
65200 and, if this fails, ask the user. His response is specially processed
65300 in the context of a predicate fixation.
65400
65500 To set up a data structure stored as a LIST STRUCTURE: The
65600 initialization is a simple SETQ to an as-yet-undefined value. Access
65700 is simply by mentioning the name. Insertion is by Merge:in or Merge.
65800 Deletion is by Pullout or Setdifference.
65900
66000 To set up a data structure stored as a PROPERTY LIST
66100 STRUCTURE: Delineate the pieces to be associated with each atom, and
66200 name them. Accession is via GETP. Insertion s via a GETP, then a
66300 MERGE:IN, then a PUT. Deletion is via a GETP, then a PULLOUT, then a
66400 PUT. Initialization is a PUT of a SETQ to an as-yet undefined value.
66500 Each named substructure must itself become a data structure,
66600 typically a simple list structure.
66700
66800 To resolve a SOMEOF decision: The user must be sufficiently
66900 informed about the choices. He types back a non-null, ordered subset
67000 of those choices. We examine them to see if they should be enclosed
67100 in a PROGN or in a REPEATEDLY, and whether we would like to see
67200 something structured in a certain way.
67300
67400 To resolve a SUBSETOF decision: Similar to above, but no
67500 fancy investigation, just string the choices together in a PROGN.
67600
67700 To defer any decision whose WHEN part is provided: We
67800 transform "before deciding firmly how to ←x ←←y" into something like
67900 "member (cons detail-of-$x-ing $y) doing-pup-list". We also translate
68000 Before any ←←x routine is finalized, After ←←x routines are
68100 finalized, and similar phrases. These must always evaluate T or NIL.
68200
68300
68400 ⊗5G. Demons and functions of interest⊗*
68500
68600 LIST:JOIN, MERGE:IN, PULLOUT. These rather standard functions are
68700 given a tiny bit of advice: if their "element" is more like a sublist
68800 than an element of their "list", then they assume that what was meant
68900 was append or merge or setdifference, not cons or insert or remove.
69000
69100 LONG NAME DEMON. If any name becomes unwieldy, he notices it and asks
69200 the user for a nickname.
69300
69400 PERMIT DETAILED DECISION. Implicit near the beginning of each
69500 decision, PUP6 called this function. It asks the user if he wants
69600 more details, and if so it gets them and prints them out.
69700
69800 STRUCTURE INDUCING DEMON. If the object is a BEING already, special
69900 considerations apply. If the object and all arg values are
70000 ill-defined, we decide not to do any structuring. Otherwise, we
70100 investigate the effects of structuring the object into the pieces
70200 specified in the args. If there is no problem, and if the user
70300 consents, we tack the appropriate Replace messages onto the coding
70400 warning list (with a high priority). We activate Long Name Demon.
70500
70600 ⊗5Natural Language Translation⊗*. We have already discussed the
70700 TRANSLATE BEING and the basic way of handling natural language input.
70800 Several BEINGs exist primarily for this purpose; RECOGNIZE:ARGS,
70900 RECOGNIZE:C*R, RECOGNIZE:CONDITIONAL, RECOGNIZE:CONJUNCTION,
71000 RECOGNIZE:EQUALITY, RECOGNIZE:FUNCTION:RETURNS, RECOGNIZE:INCLUSION,
71100 RECOGNIZE:LITERALS, RECOGNIZE:NUMBER, RECOGNIZE:SET:RELATIONS,
71200 RECOGNIZE:SOME:MEMBER, ADD:DEFINITION, ADJECTIVE:HANDLER, REPEATEDLY.
71300 Also, there are several functions related to translation: e.g.,
71400 UNGERUNDIFY, PLURAL, OPPOSITE. All these are straight-forward, and
71500 their task is obvious from their name.
00100
00200 .SELECT 1
00300
00400 ⊗5(ii) The increment of knowledge necessary to write GI⊗*
00500
00600 ⊗4APPLY RULE⊗*. This BEING accepts two arguments, which must be
00700 interpretable as a string and a rule, respectively. The string is
00800 compared against the left side of the rule and, if applicable, the
00900 change indicated by the right side is executed on the string. It is
01000 immediately encodable if the user wishes all possible applications of
01100 the rule to the string; else a more specialized version must be
01200 synthesized.
01300
01400 ⊗4CONSTRAIN⊗*. Try first to decide if it is meaningful for the given
01500 structure to be constrained, and, if so, how. Next see whether any
01600 other BEINGs can help. Finally, ask the user to specify any
01700 constraints he can think of. For example, can a list structure grow
01800 indefinitely, or can we use some fancy programming trick because the
01900 size stays small?
02000
02100 ⊗4GRAMMATICAL INFERENCE⊗*. Needless to say, this is a high-level,
02200 domain-specific BEING! It must infer grammars from exemplary strings;
02300 it knows this to be the only reasonable g.i. paradigm. If it fails,
02400 some genralizations are inductive inference, enumeration,
02500 problem-solving; some alternatives are concept formation and
02600 simulated evolution. There are many minor decisions, similar to the
02700 concept formation decisions (e.g., examine a sample GI-user dialogue
02800 to finalize the printing formats). The major decision is dichotomous:
02900 whether our new specialized BEING should retain the ability to input
03000 the type of grammar and vary its strategy accordingly, or whether
03100 only one fixed type of grammar (e.g., context-free) will always be
03200 used, and may be "built in" to the code. The result of this decision
03300 is to pass control to one or the other of the following two BEINGS.
03400
03500 ⊗4INFER MULTI-CLASS GRAMMARS⊗*. We read in the type of grammar, and
03600 then call the appropriate specialist for that type. Thus we have a
03700 big switch here.
03800
03900 ⊗4INFER FIXED-CLASS GRAMMARS⊗*. This routine determines at
04000 program-synthesis time what the class is going to be, and thus will
04100 be a fixed call to one of the following four BEINGS. The speedups
04200 will arise from using the constraints on the rules.
04300
04400 ⊗4INFER PHRASE-STRUCTURE GRAMMARS⊗*. There are no rule constraints in
04500 a type 0 grammar; each half of the rule is viewed as an arbitrary
04600 list of letters. When a TEST is indicated, the
04700 fringe-of-consciousness demon must report it is thinking of PARSE. The
04800 left and right sides of a rule will be destructive operations on the
04900 data representation of a rule.
05000
05100 ⊗4INFER CONTEXT-SENSITIVE GRAMMARS⊗*. We shall only report on the
05200 differences between the INFER... BEINGS. This one knows that the
05300 right side of the rule must be at least as long as the left side.
05400 This will be used as a pruning heuristic when proposing plausible
05500 rules.
05600
05700 ⊗4INFER CONTEXT-FREE GRAMMARS⊗*. Grammars of type 2 have as their
05800 left side a single nonterminal. Further simplifications can occur by
05900 only working toward a Griebach Normal Form or Chomsky Normal Form
06000 grammar, although from the standpoint of inference energy these are
06100 harder.
06200
06300 ⊗4INFER REGULAR GRAMMARS⊗*. A type three grammar has a unit on the
06400 left and a pair of them for a right side (one terminal, one
06500 nonterminal). This is a very powerful pruning heuristic.
06600
06700 ⊗4MAJOR MODIFY STRUCTURE⊗*. The old, simple "insert,delete,alter"
06800 paradigm of modification was no longer sufficient. This BEING heads a
06900 whole complex of modify BEINGS, including the old one as the
07000 low-level workhorse primitive. Here is a sketch of the organization:
07100 .BEGIN NOFILL
07200 MAJOR-MODIFY-STRUCTURE
07300 / | \
07400 MODIFY-UNTIL | MODIFY-SOME
07500 \ | /
07600 THE-ORIGINAL-MODIFY-STRUCTURE-BEING
07700 .END
07800 So this top-level modifier calls some subset of the three lower
07900 modification BEINGS. Later, we had to add a fourth alternative,
08000 EXAMINE DATA STRUCTURE, to aid in writing the INF program.
08100
08200 ⊗4MODIFY SOME⊗*. We determine a set S and a predicate P at synthesis
08300 time. At run time, we map through S and apply P; all elements
08400 responding positively are modified (using the original MODIFY
08500 STRUCTURE.) The decisions about P and S are easily deferrable.
08600
08700 ⊗4MODIFY UNTIL⊗*. This BEING is simply an order to compose
08800 REPEATEDLY π. MODIFY_STRUCTURE. The former BEING bears the brunt of
08900 the responsibility of the interface.
09000
09100 ⊗4PARSE⊗*. Attempt to parse a string from the current set of rules,
09200 by reversing each rule and composing their applications. We decide
09300 at synthesis time whehter or not to maintain a parse tree, whether or
09400 not to maintain a list of the rules used during the parse, whether we
09500 stop parsing at any legal string or only at "S," whether we parse
09600 forwards or backwards or both, how deeply in each direction (this is
09700 always deferred until much later), whether one direction after
09800 another or (simulated) simultaneously. We look for the DataStructure
09900 part of the BEING representing a rule, and ask him about constraints
10000 on the rule, and about how to destructively recover the left and
10100 right sides separately.
10200
10300 ⊗4PARSE BACKWARD⊗*. This BEING is given two strings and a set of
10400 rules; the task is to apply anti-rules to the target string until it
10500 becomes the initial string. This is typically done breadth-first.
10600 Special modifications must be made if there is a parse tree to
10700 maintain, if a set of rules used must be maintained, etc. The basic
10800 body is a nest of FOREACH calls (∀rule, ∀way of applying the rule to
10900 the string, recurse). To avoid infinite recursion, we must supply a
11000 third argument: the depth to which we compose these anti-rules before
11100 we give up. When calling itself recursively, this level will be
11200 decremented.
11300
11400 ⊗4PARSE FORWARD⊗*. This BEING is analogous to the previous one, using
11500 rules themselves instead of anti-rules. Notice how clearly the place
11600 to insert searching heuristics is marked out for us (although none
11700 are present.)
11800
11900 ⊗4STRING⊗*. This is a structure whose parts are a name, a list of
12000 letters, a set of comments. It is advisable to use list structures
12100 rather than property lists to represent strings, since they will
12200 probably only be accessed by one of their three parts. In the GI
12300 program, we don't use STRING itself, but rather we mention
12400 UNCOMMENTED STRING, which causes this BEING to create a new
12500 specialized version of itself, sans the third, comments part.
00100
00200 ⊗5(iii) The increment of knowledge necessary to write INF⊗*
00300
00400 ⊗4EXAMINE STRUCTURE⊗*. This is another one of the parts of a major
00500 MODIFY structure. If the fringe of consciousness demon can't come up
00600 with a reasonable matching function, one is selected now. The basic
00700 body says to do PATTERN MATCH, using this match function, and convey
00800 the results to the caller (who may be the user.) The inputs are thus
00900 a pattern, a data structure name, and possibly a hint to a match
01000 function.
01100
01200 ⊗4PATTERN MATCH⊗*. This existed as a system function earlier, but for
01300 INF it is necessary to write a tailored pattern-matcher. In
01400 particular, INF demands that we strip away the common head and tail
01500 from both pattern and expression, and then compose the two remaining
01600 pieces into the left and right sides of a new plausible rule, and
01700 then check that this conforms with the constraints on rules. This is
01800 certainly different from the type of match needed by CF! Notice that
01900 we had to add the "eliminate common head/tail" functions to our list
02000 of system primitive functions.